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1. Motivation 

The overall dynamics of a CLE can be described by a cascade of differential equations of the following form (see 
e.g. [1]): 

X fe (m)x fe =f fe (v,x fc ,...,x 1 ,x™ put ), fc=l...n, t e [0,T] (1) 
where the states x^ are functions of time t and take their values in R nk (they represent the cumomer fractions 
of each metabolite) and the constant vectors x !™ p " are vectors of R™<= P depend on the labeling of the input sub- 
strates. The vector v € R m denotes the unknown fluxes and are diagonal matrices containing unknown pool sizes 
corresponding to to cumomer fractions of weight k. 

The particular form taken by the functions depends on the transition pattern of carbon atoms occuring for each 
reaction in the metabolic network. Writing down by hand the expression of these functions is quite easy for a small 
sample network but becomes untractable for a realistic network. As far as numerical computations are concerned 
(direct problem solving or identification) the real concern is to write some specific computer code computing these 
functions and their exact derivatives with respect of states xi, . . . , x„ and v, in a target language. The formal expres- 
sion of the overall system could be interesting for testing the identifiability of the flux vector v and the pool sizes, 
but previous work shows that the size of realistic networks prevents the use of classical algorithms based on symbolic 
computations. 

Nowadays, the most efficient way of describing a metabolic network is to use the Systems Biology Markup Lan- 
guage (SBML, see [2], [3]), as it has become the de facto standard, used by a growing number of commercial or 
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open source applications. The SBML markup language is a dedicated dialect of XML (see e.g. [4]) with a specific 
structure which allows to describe the different compartments, species, and the kinetics of reactions occuring between 
these species. Transformations can be applied to the SBML file describing the network, described in another XML 
dialect, the eXtendted Stylesheet Language (XSL), and the kind of transformations we are interested in, are those 
which will allow to generate the specific numerical code we need to solve the identification problem (stationnary and 
instationnary). The generated computer code is specific for each particular metabolic network and associated CLE, an 
thus more efficient, readable and resusable, than a general application written to cover all possible cases. 

The target language which has been chosen is Scilab (see [5,6]) because it is an open-source Matlab compatible 
language, allowing high-level programming together with compiled libraries with performant differential equation 
solvers, optimization routines and efficient sparse matrix algebra. The GUI of the final application is also described in 
XML, using another specific dialect called XMLlab (see [7]), which is available under the form of an official Scilab 
Toolbox (see Scilab www site). The description of the GUI is obtained with another pass of XSL transformations on 
the original SBML file describing the network. This GUI allows the biologist to enter the input data (label input of the 
substrate, known fluxes or a priori relationships betwen them, label observation), and lauch the optimization process 
solving the identification problem. 

Throughout this paper, we will use a very small example to illustrate our approach. This is the branching network 
used by Wiechert and Isermann in [8] (see Figure 1). The SBML file corresponding to this network can be found in 



v 4 : D + D ->■ F 
#i + #j^#ij 
v 5 : F -> G 
#ij^#ij 
«6 : A_out — » A 
#ij^#ij 

T 

c 

Figure 1. The branching network, with associated fluxes vx, V2, V3, 1)4, V5, vg and the carbon atoms transition. 

Figures A. 1 and A.2 in the appendix section. As it can be seen, the description is very verbose. The added material 
describing some informations on the CLE (label input, label observation, carbon atom mapping) are entered in the 
species and reaction notes, directly from the CellDesigner interface. In the future, we plan to develop a plugin to 
directly enter these information from within CellDesigner and create XML annotation in the SBML file. . 

2. Mathematical modelling in the stationnary case 

It has been shown in [1] that the actual state equation is in fact a succession of linear ordinary differential equation, 
where for each k the non homogeneous part of the right handside depends on xi, . . . , x/-_i, giving the following 
cascade 

X fc (m)ifc(t) =M fc (v)x fc (i)+b fc (v,x fe _ 1 (<),...,x 1 (t),x™ put ), k = l...n,t>0. (2) 

Each component of vectors X& (i) represents a cumomer fraction of weight k of a given species (for a proper defini- 
tion of cumomer and cumomer weight see [1]). The constant vectors x™ pa contain the cumomer fractions of weight 
k of species which are input metabolites. The diagonal matrices Xfc(m) depend on the stationnary concentrations 



A_out 



t 

F 



v\ : A4F 
#ij^#ij 

v 2 : A -> D + D 
#ij -> #i + #j 

v 3 : A -> F 
#ij^#ji 
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of metabolites. The matrix Mfe(v) and the vector are constructed by considering the balance equation for each 
cumomer of the vector Xfe. Constructing these matrix by hand is a very tedious task but can be automated if adequate 
data structures are used, in order to represent the metabolic network and the carbon transition map for each reaction 
(we will explain later how we deal with this particular information). 



In the stationnary case, the CLE is considering the asymptotic behaviour of the system (2), 

= M fe (v)x fc +b fc (v,x fc _ 1 ...,x 1 (t),x" ptI '), k = l...n, (3) 

in this case the states X& do not depend on time anymore and the identification problem is restricted to the determi- 
nation of the flux vector v such that some cost function is minimized. For a given flux vector v this cost function can 
be classically defined as the squared norm of the difference betwen an observation vector y meas g ^ji meas anc j the 
corresponding synthetic observation y(v) computed by solving the state equation (3) for the given value of the fluxes. 

In the following, we will consider that this observation is composed of isotopomer and cumomer fractions of given 
species, which can always be computed as linear combination of cumomers, i.e. there exists n non zero matrices 
Ci, C 2 , . . . , C„ such that 

n 

y( v ) = ^ CfcXfe ( v )' 

k=l 

where we have used Xfc(v) to denote the solutions of the state equation (3) for a given flux vector v. 

We have also natural constraints on the flux vector which result of the particular structure of the metabolic network 
(the stoichiometric balances) and some other kinds of linear constraints on the fluxes, which can express that some 
of the fluxes are fixed, for example the flux of input substrate, or some more specific information, e.g. some linear 
combination of fluxes which should be zero. Thus, the constraints on v take an affine form 

Av = w. 

There is usually some measured extracellular fluxes v meas , which has to be compared with the actual value of these 
fluxes, which can be always be expressed as linear function of v. The mostly used cost function is the Chi-Square 
function 

J ( v ) = \ IK 1 (y( v ) - y meas )\\ 2 + \ IK 1 (Ev - v meas )\\ 2 

where a and a are diagonal positive definite matrices containing the standard deviation for each observation. Minimiz- 
ing the Chi-Square function, under the hypothesis of gaussian distribution, is equivalent to maximizing the likelihood 
of measurements. 

v = arg min J e (v). 

v£l m 

< Av = W, (Pe) 

v > 0, 

where y(v) = J2k=i CfcX fc (v) and x fc (v), k = 1 ... n are the solutions of the state equation (3). 



2.1. Parametrisation of the admissible fluxes subspace 



The subspace of admissible fluxes is determined by the system of equations and inequations 




(4) 

In order to detect any redundancy or incompatibilities due to the eventual complimentary constraints added by the 
user, an admissibility test is done on the system. We do it by solving a trivial linear program, which allows to test if 
w is in the range of A and then if the subspace (4) is non-void. 
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Then there are two possibilities to obtain a parametrization : by computing an orthonormal basis {V 1 , . . . V} of 
the kernel of A (where p — rank A) and the minimum norm solution vo of Av = w, any v satisfying (4) can be 
expressed as 

v = Vq + v , 

where q is a vector of size to — p. 

The classical parametrization, using the free fluxes, can be found by computing the QR factorization of A. There 
exists an m x to permutation matrix P = [P 1; P 2 ] and an orthogonal square matrix Q = [Q 1; Q 2 ], where Pi and 
Qi represent the first p columns of P and Q, and a full rank p x p upper-triangular matrix Ri such that 



AP = Q 



Ri 


R 2 









where the lower right zero block is absent if A has full rank. The free fluxes in the vector are given by q = P 2 v 
and the complimentary dependent fluxes are given by P^v. Straightforward computations give the parametrization 
v = Vq + v , where q has to — p components and 

V = P 2 PiRr 1 R 2 , v = PiR^Q^w. 



Such parametrizations remove any redundancy in the constraints, and allow to identify which constraints in Vq + 
vo > are equality constraints blocking the value of some fluxes. Some of them are not specified in the initial system 
(4) but are added de facto in the parametrization because of the implicit fluxes balancing constraints. This situation 
can be detected when a row of Vj is equal to zero. Hence, if we define the sets / = {i, Vj ^ 0} and the set D 
containing the indices of dependent fluxes, the parametrized optimization problem takes the form 

q = arg min J e (Vq + v ), 

qeRm - P 

Qi > 0, i = 1 . . . m — p, (5) 
. v,q+ (v„) t >o, ieinD, 

and we have v = Vq + v . The gradient of the parametrized cost function can be expressed via the chain rule as 

— J £ (Vq + v )j = V T VJ £ (Vq + Vo ). 

The type of parametrization (orthogonal or free fluxes) used in the optimization does not seem to influence the condi- 
tioning of the algorithm, so the free fluxes parametrization is used, because of its biological interpretation. 

2.2. Identifiability and regularization 

When e = the constraints on q are not enough to ensure existence of a solution because the problem may be 
unbounded. In fact, existence and unicity of a solution will occur if the fluxes are identifiable. A general discus- 
sion about this subject can be found in [8], where the authors propose an algorithm based on integer arithmetics to 
test the structural identifiability of metabolic networks. The most encoutered problematic situation corresponds to 
bidirectional reactions, such as 

vi : A -> B, v 2 : B -> A 

where V\ — v 2 (the net flux) is identifiable but V\ and v 2 are not individualy identifiable. The counterpart of such a 
situation is that the optimal v\ and v 2 tend to infinity when e — > 0, and the cost function J e is ill-conditioned when 
e is too small, leading to convergence problems in the optimization phase. The change of variables proposed in [9] 
considers the net flux v net — v\ — v 2 and the exchange fluxes v xc h = min(wi,w 2 ) and a "compacification" of v xc h 
defined by 

Vxch 

V[ °- 1] = J+^c~h 
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where /3 > 0. The above change of variables maps [0, +oo[ to [0, 1[ and thus is interesting from a numerical point of 
view. Although these new variables make sense from a metabolic point of view, it remains that the overall mapping 
from (v\, v 2 ) to (v net , W[o,i]) i s not differentiable and thus needs to be approximated. A more systematic approach is 
proposed in [10] where all free fluxes qi > are mapped to r t e [0, 1[ with the change of variables q = q(r), where 

q l = 8 , i = l...m-p, 

l-n 

where j3 > is a scaling parameter. In this case, the inequality constraints in (5) become non linear and the new 
optimization problem is 

v = arg min J e (Vq(r) + v ), 

1 - 5 > n > 0, i = 1 . . . m - p, ( 6 ) 
V jq (r) + (v ) 4 > 0, ieinD, 
where S > can be arbitrary small. The gradient of the cost function is given by 

d 



dr 



J £ (Vq(r) + v ) ) = q'(r) T V T V J £ (Vq(r) + v ). 



2.3. Multiple experiences 

In the following we will also consider the case where multiple CLE are done with the same metabolic network but 
with diffferent labeling of the input metabolites, given by x mp "*'* for i = 1 . . . n exp . Thus, we will consider the cost 
function 

n e , P 

J -( v ) = 2 £ (H^ 1 (y( v . xinput,< ) - y meas ' 1 ) f + IK 1 ( Ev - vobsM 2 ) + flMl 2 

i=l 

where y meas ' 1 i s the observation of labeled material for experience i, v b s ,i is the vector of measured extracellular 
fluxes and 

n 

y ( VjX mp«M) = ^ Cfe X fc (v,X m ^) 
fe=l 

and x fe (v, x m P ut ' 4 ), k = 1 . . . n is the solution of 

= M fc (v)x fc + b fc (v,x fc _ 1 ,...,x 1 (t),x™ put ^), fc = l...n, (7) 

2.4. Computation of the gradient of the cost function 

The computation of the gradient of J(v) needs the derivative of x(v) with respect to v. In the stationnary case, it 
makes sense to compute this derivative by implicit differentiation of the state equation (3). To this purpose, we adopt 
the notation 

f fc (v, x fc , . . . , Xl , x«) = M fc (v)x fc + b fc (v, x fc _! , . . . , X! , x«)and (8) 
we denote by x* (v) the solution of 

f fe (v,x fe ,...,x 1 ,x«) = 0, k = l...n. 
By differentiating these equations with respect to v, when x = x 2 (v), we obtain for k = 1 ... n 

h 

Ui fc / input A \ / input,i\ , \ A < - /1 ? / input A \ ^^-l /rw 

= ^(v,x fc ,..., Xl ,x fe p ) = — (v,x fe ,..., Xl ,x fc p )+^^(v,x fc ,..., Xl ,x fc p ')^. (9) 



Since ffc is linear with respect to X& for fixed v, we can determine as the solution of a linear system of equations, 
whose right hand side is a function of x; and ^ for I = 1 . . . k: 

M*(v)^ = g(v,x,,..., Xl ,x^ (10) 

Hence, the key ingredients in the computation of the derivatives of x fe are the derivatives for I < k and the 
derivatives This will be one of the main tasks of the automatically generated computer code, together with the 
assembly of matrices Mfc(v). Since the gradient of the cost function J(v) will be required at each iteration of the 
optimization algorithm, these matrix will be assembled as sparse matrices in order to speed up the computations, 
particularly the resolution of the linear systems (10). 



The final computation of the derivative gives 

dJ ^f> = (Ev - v obs ,) T a" 2 E + £ v T + (y(v, x«) - y™° S ) T £C fe ^. 

i=\ k = l 



3. Architecture of the computer code generation algorithms 



The most innovative aspect of this work is the choice of the techniques to generate the code : from the original 
SBML file edited under Cell Designer (or any other SBML compliant software), only XSL (eXetended Stylesheet 
Language) transformations are used to generate the Scilab code computing the specific objects for a given Carbon 
Labeling Experiment. The way transformations are done is described in XSL stylesheets, written in anthor XML 
dialect. XSL is very different from the typical programming languages in use today. One question that's being asked 
frequently is : what kind of programming language is actually XSLT ? Until now, the authoritative answer from some 
of the best specialists is that XSLT is a declarative (as opposed to imperative) language. The XSL stylesheets are thus 
very explicit, human readable, and easy to debug and maintain. 



In the whole process which maps the original SBML file to the computer code and the graphical user interface, 
successive XSL transformations occur. The first set of transformations aims to translate all the specific information 
about the CLE into XML markup which can be later used, for example, the carbon atom mapping of each reaction 
(this step is described in Appendix A). Then two different main paths are followed : 



(i) The first series of transformations is dedicated to the computer code generation : 

(a) A main assembly loop is processed, which for each weight k, generates, for j = 1 . . . nj, some intermedi- 
ate XML markup declaring the contribution of each cumomer fraction (x^ to the matrices 

The contributions to Mfe(v) are functions of the flux vector v only, but the contributions to other matri- 
ces are functions of lower weight cumomer components and eventually of x™ pMt , respecting the weight 
preservation property (see [1]). This intermediate step is event-driven, i.e. contributions to the matrices 
are dumped in the order they occur. The contributions are gathered for each matrix in the subsequent 
transformation which produces the Scilab code. 

(b) The Scilab code solving the cascaded linear systems is generated. For each weight k, the matrix (v) 
is stored as a sparse matrix and its sparse LU factorization is computed (the sparse triangular factors are 
also retained because they are also needed to compute the derivatives). The linear system giving x^ is 
then solved by using the precomputed sparse LU factors. Then we compute the matrices dfk/dv and 
(dh/dxi) i=1 fe _ 1 , which need the previously computed cumomer vectors Xi, . . . , x fc _! and finally solve 
the linear system giving dx k / dv. 
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(ii) The second series of transformation aims to build the specific graphical user interface for the given metabolic 
network. To this purpose, an XML file conforming to the XMLlab DTD is generated. The structure of the inter- 
face is described in a high level way : there are given sections of the interface, each of these being dedicated to 
different purposes. The first section is hosting all the fluxes, the second section is hosting the fluxes observations 
with associated standard deviation, the third section hosts together the label measurements and the label output 
corresponding to the current fluxes. This is the place where the user can compare the original measurement and 
the reconstructed measurement after the identification process. The fourth section displays all components of 
the cumomer vector, sorted by weight and by species name. The last section is reserved to the parameters of the 
identification method (maximum iteration, regularization parameter, and so on). The structure of the original 
SBML file, enriched with the specific annotations in the sbml namespace (see Appendix A), allows to perform 
this step in a very straightforward way. 

The generated Scilab code computing the cumomers vector x, the derivative matrices, the cost function and its gradient 

are given on Figures A. 6 and A. 7 in Appendix A. 



4. Numerical results 



5. Mathematical modelling in the unstationnary case 



In the unstationnary case, the CLE is done when the system (2) has not reached its asymptotic behaviour. The 
measurements 

y meas ' j , j i. ••".<• 

are done at different values tj of time, and we can make the hypothesis that the final time T in (2) is equal to the 
final observation time i.e T = t nt . The fundamental difference with the stationnary case is that the stationnary 
concentration of metabolites is also an unknown, i.e. the vector m is also to be determined. 
The cost function takes the form 

J £ (v,m) = - J2 (|k _1 (y(*;> v,m) -y"™')f) + - ((a" 1 (Ev - v meas )|| 2 , 

where 

n 

y(ij,v,m) = ^CfeXfc^i.v.m), 
fe=i 

and x/j(t, v, m), k = 1 . . . n are the solutions of the state equation for a given pair (v, m) of fluxes and pool sizes : 

Xfc(m)x fc (i) = M fc (v)x fc (t) + b fc (v,x fc _i(t), . . . ,x 1 (t),x™ put ), k = 1 . . .n, t > 0. (11) 
The minimization problem is the the following : 

(v,m) = argmin J e (v,m), 

V.II1 

AV = (J?) 
v > 0, 

m > 0. 

The main difficulty is the computation of the gradient of J, which can be done by using the sensitivity matrices 
- , computed as the solution of a system of differential equations obtained by implicit differentiation of (11) as 
in the stationnary case. The problem is that this approach is computationnaly intensive (see e.g. [9,1 1,12]) because it 
involves a cascade of differential equations where the state is a matrix (instead of a vector). 

A more suitable approach for non-stationnary problems is to use the adjoint state method (see [13,14,15]). If the 
number of parameters of interest (the fluxes) exceeds the number of model outputs for which the sensitivity is desired, 
the adjoint method is more efficient than traditional direct methods of calculating sensitivities. The gradient of J 
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can be computed at the same cost as the state equation (11). A far as the statistical evalution of identified fluxes 
is concerned, the sensitivity of the model output y(t, v, m) with respect to v can be obtained with a cost of n meas 
state equations. Hence, the adjoint method will always outperform the sensitivity method for the computation of the 
gradient. For the output sensitivity, the dimension of y(t, v, m) observations is generaly lower than the number of 
fluxes, so the same method should be used. The adjoint state method is best understood in continuous time and the 
next section is devoted to this presentation. Section 5.2 will detail its practical implementation in discrete time. 

5.1. The adjoint equation in continuous time 

In order to simplify the presentation of the results, we will adopt the block notation x = (x x ; x 2 ; . . . x„) for the 
overall cumomer vector, and for the state equation we will write 

X(m)x(i) - f (x(t), v) = 0, t e [0, T[, (12) 

where f (x, v) = (fi(x, v); . . . f„(x, v)) where f fe is defined by (8). We also define C = [Ci, C 2 , . . . , C„] so that 
y(t) = Cx(£). Without loss of generality, we consider only one measurement at final time T. 



The adjoint state method allows to compute the total derivative with respect to v and m of a given function 
J(x(v, m)) G R where x(v, m) is the solution of the state equation (12). If the gradient of J e is to be computed then 
we will take 

/(x) = i||a- 1 (Cx(T)-y meas )|| 2 . (13) 

In the following, we do not consider the quantities in J e depending explicitely on v hence, we define J(v, m) = 
I(x(v, m)). Let us define the Lagrangian 

L(x,p,v,m) =/(x)+ / p(i) T (X(m)x(t)-f(x(t),v))di, (14) 
Jo 

where the adjoint state p = (pi; p 2 ; . . . p„) has the same block structure as x. The first remark that can be done is 
that when x is the solution of the state equation (12), we have 

L(x(v,m),p,v,m) = J(v,m), 

and when we express the total derivative of J(v, m) e.g. with respect to v we have 

dJ(v, m) dL . , <9x(v, m) dL . , , 

1 = ^-(x(v,m),p,v,m) h — (x(v,m),p,v,m). (15) 

rfv ax av ov 

The idea of the adjoint state technique is to compute p such that |^ = 0. Then the remaining part of the derivative 
can be computed in a straightforward way. This adjoint equation is a (backward in time) differential equation given 
by 

X(m)p(i) = -(£(x(i),v)) p(t), te [0,T[, (16) 

with the final conditon 

p(T) = -C T a- 2 (Cx(T) - y meas ) . (17) 

Because of the block triangular structure of the adjoint equation has also a cascade structure, but in reverse order, 
i.e. p„ is obtained at first and pi at last : 

p k (T) = -Cja- 2 (C X (T)-y meas ), (18) 

5 V 1 



X k (m)p k (t) = -Mjp k (t)+ («- -) Pi(t), t €]0,T[, (19) 

i= k +i V x fe 



for k = 1 . . . n. 
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When for a given pair (v, m) the state equations and the adjoint state equations are solved, then the gradient of J 
can be readily computed by using (15) 



dJ(v, m) dL i 
dv dv 



(x,p,v,m)=V/ p k (t) T ^ -(v,x(t))dt. 
k=i Jo 0v 



We will give the derivative with respect to m in the next section. 

Mi 

dv 

have 

8y(T) " 



Remark 1 The output sensitivity dy a^ can be computed in the same way by taking J(x) = Cx(T). In this case, we 



dv 

fe=i 

where the final condition (18) is replaced by Pfe(T) = — and the adjoint equation (19) is unchanged, but pfc(i) is 
a matrix of size x n meas . 

5.2. The adjoint equation in discrete time 

The previous sections shows that once the state equation is solved, the gradient of J can be computed at the cost of 
one more differential equation (16), which has to be compared to the cost of computing the sensitivity function. But 
the practical implementation needs to reconsider this approach in discrete time, since we cannnot just approximate 
independently the continuous state and the continuous adjoint state equation, i.e. the discretized adjoint must be the 
adjoint of the discretize state. This is a reason why high order integration schemes are seldom used in adjoint codes 
written by hand (otherwise automatic differentiation can be used, see [16]) since the discrete adjoint is obtained by 
formal derivation of the state integration scheme. Since we have to consider that the state equation could be stiff 
because of eventual large values of fluxes, a good compromise is the implicit trapezoidal rule, which is of order 
2. Hence, we consider a discretization of the interval [0,T] by considering t t = (i — l)h, for i = 1...N and 
h = T/(N — 1), and we denote by x 1 the approximation of x(tj). The implicit trapezoidal rule applied to equation 
(12) gives 

D(m)(x m -x 4 ) - ^(f(x m ,v) + f(x\v)) =0, i = 1...N- 1, (20) 

and we still denote by x(v, m) the solution of (20). We consider that for each measurement time {Tj} 1<:; -< nt there 
exists 9(j) such that Tj = tg^y with 9(n t ) — N, and we define the cost function J(v, m) = /(x(v, m) where 



3 = 1 

The discrete Lagrangian is defined by 

N-l 



7 W = 2^lr 1 (Cx 9(j) -y measj ) 



L(x, p, v, m) = J(x) + ]T (p*) T (D(m)(x* +1 - x*) - J(f (x*+\ v) + f (x*, v))), (21) 

where p l is the adjoint state for time i. Straightforward computations show that the adjoint equation is given by 

X (m) - *g<*<, vf) P- = (X, m) + *g<*<, v) T ) p< - (g)" , 1 < i < N, (22, 
with the final condition 

(x (m >-^,vf)p- = -(^) T , 



where = if 9(j) ^ i for all j = 1 . . . n t and 



dl 
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otherwise. Once the state and the adjoint state equations are solved, the gradient is given by 

dJ(v, m)\ T h } ( df 



dv 



i—l x ' 



and 

T N-l 



(^) <*->(-"-'») p'. 



i=l 

where for a given vector z the matrix ^ (X(m)z) is defined by 



d ^\ | Zi, if the cumomer fraction belongs to metabolite j, 

dm ^ 



(X(m)z)] =^ (24) 
i? 0, otherwise. 
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Appendix A. Workflow of SBML markup to Scilab code 

The specific annotation in the private namespace xmlns : smtb="http : / /www . utc . f r/sysmetab" names- 
pace concerning the carbon atom mapping of each reaction is done as follows : for example, for the reaction corre- 
sponding to flux v 2 , 
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v 2 : A -)• D + D 



#ij -> #i + #j 



some specific markup is generated from the string I J > I+J found in the reaction <notes> element, as 
depicted in Figure A. 3. 

For each intermediate species, we also add some markup specifying the exhaustive list of its cumomers. For 
example, for species A in the branching network, we have the cumomers A x i, Ai x , An (we don't take into account 
A xx which is equal to 1), and the <species> element corresponding to A is enriched as depicted in Figure A. 4. Each 
<smtb : cumomer> element has an id of the form A„ where n is equal to the number whose base 2 representation 
is equal to the cumomer pattern when replacing the x's by zeros. Each <smtb : carbon> element denotes a 13 
neutrons isotope at position given by the position attribute in the molecule. 

When we consider the vectors of intermediate species cumomer fractions x,t for weights up to 2 for the branching 
network, we have 



A redundant enumeration is also generated (see the <smtb : listOf IntermediateCumomers> element on Fig- 
ure A.5) giving the ordering of all cumomers sorted by weight, allowing to keep the correspondance between compo- 
nents of vectors xi, x 2 and corresponding species cumomers (this information is needed in the subsequent transforma- 
tions). A similar enumeration is also generated for input species cumomers in the <smtb : 1 ist Of Input Cumomer s>, 
giving the correspondance between the components of vectors x ™ ptl * and corresponding cumomers : 




input 



(A_out x i, 



A_outi x ) , x l 2 ' 



input 



(A_out n ) 



T 
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1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 
44 
45 
46 
47 
48 
49 
50 
51 
52 
53 
54 
55 
56 
57 
58 
59 
60 
61 



<?xml version=" 1 .0" encoding="UTF-8" ?> 

<sbml level="2" version="l" xmlns=" h 1 1 p : //www. sbml . org / sbml / le v el2 " 
xmlns:ns="http: //www. sbml.org/sbml/level2" 

xmlns:celldesigner="http: //www. sbml .org/2001/ns/celldesigner"> 

<model id=" branching "> 
<listOfCompartments> 

<compartment id="default" /> 
</listOfCompartments> 

<listOfSpecies> 

<species compartment=" default " id="A"/> 

<species compartment=" default " id="D"/> 

<species compartment=" default " id = "F"> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 99 9/ xhtml "> 

<body> LABELJVTEASUREMENT lx,xl,ll </body> 
</html> 
</ notes> 
</ species> 

<species compartment=" default " id = "G"/> 

<species compartment=" default " id = "A_out"> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 999/ xhtml "> 

<body> LABEL JNPUT 01,10,11 </body> 
</html> 
</ notes> 
</ species> 
</ listOf Species> 

<listOfReactions> 

<reaction id="vl" reversible =" false "> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 99 9/ xhtml "> 

<body> IJ > IJ </body> 
</html> 
</ notes> 

<HstOfReactants> 

<speciesReference species ="A" /> 
</ HstOfReactants> 
<HstOfProducts> 

<speciesReference specie s="F"/> 
</ HstOfProducts> 
</ reaction> 

<reaction id="v2" r e v e r s i b 1 e =" f al s e "> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 999/ xhtml"> 

<body> IJ > I+J </body> 
</html> 
</ notes> 

<listOfReactants> 

<speciesReference species="A" /> 
</ HstOfReactants> 
<HstOfProducts> 

<speciesReference species="D" s toichiome tr y =" 2 . " /> 
</ HstOfProducts> 
</ reaction> 



Figure A. 1 . The XML file describing the branching network, with adde^ information concerning the carbon mapping for each reaction, and the 
detail of input and measured label (lines 1 to 61) 



<reaction id="v3" r e v e r s i b 1 e =" f a 1 s e "> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 99 9/ xhtml "> 

<body> IJ > JI </body> 
</html> 
</ notes> 

<HstOfReactants> 

<speciesReference species ="A" /> 
</ HstOfReactants> 
<HstOfProducts> 

<speciesReference specie s="F"/> 
</ HstOfProducts> 
</ reaction> 

<reaction id="v4" r e v e r s i b 1 e =" f a 1 s e "> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 99 9/ xhtml "> 

<body> I+J &gt ; IJ </body> 
</html> 
</ notes> 

<HstOfReactants> 

<speciesReference species="D" stoichiometry ="2" /> 
</ HstOfReactants> 
<HstOfProducts> 

<speciesReference species="F"/> 
</ HstOfProducts> 
</ reaction> 

<reaction id="v5"reversible=" false "> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 99 91 xhtml "> 

<body> IJ &gt ; IJ </body> 
</html> 
</ notes> 

<listOf'Reactants> 

<speciesReference specie s="F"/> 
</ HstOfReactants> 
<HstOfProducts> 

<speciesReference species ="G" /> 
</ HstOfProducts> 
</ reaction> 

<reaction id="v6" r e v e r s i b 1 e =" f a 1 s e "> 
<notes> 

<html xmlns="http: //www. w3 . org / 1 99 9/ xhtml "> 

<body> IJ > IJ </body> 
</html> 
</ notes> 

<HstOfReactants> 

<speciesReference specie s="A_out"/> 
</ HstOfReactants> 
<HstOfProducts> 

<speciesReference species="A" /> 
</ HstOfProducts> 
</ reaction> 

</ HstOfReactions> 



</ model> 
</sbml> 



Figure A. 2. The XML file describing the branching network, with added information concerning the carbon mapping for each reaction, and the 
detail of input and measured label (lines 62 to 121). , n 



1 


<reaction position="2" id="v2" 


name=" v2 " r e v e r s i b 1 e =" false "> 








2 


<HstOfReactants> 










3 


<speciesReference species = 


'A"> 








4 


<smtb:carbon position="2 


' de s tination=" 1 " occurence=" 1 " 


species = 


'D' 


/> 


5 


<smtb:carbon position="l 


' de s tination=" 1 " occurence="2" 


species = 


'D' 


/> 


6 


</ specie sReference> 










7 
8 


</ listOfReactants> 
<listOfProducts> 










9 


<speciesReference species = 


'D"> 








10 


<smtb:carbon position="l 


' destination="2" occurence=" 1 " 


species = 


'A' 


/> 


11 


</speciesReference> 










12 


<speciesReference species = 


'D"> 








13 


<smtb:carbon position="l 


' de s tination=" 1 " occurence=" 1 " 


species = 


'A' 


/> 


14 


</ specie sReference> 










15 


</ listOt'Products> 










16 


</ reaction> 











Figure A.3. Fragment of the XML file transformed after the first set of XSL transformation: the atom mapping is translated into XML markup in 
the namespace smtb 



1 


<species compartment=" default " id="A" 


name="A' 


type=" intermediate" carbons="2"> 


2 


<smtb:cumomer id="A_l" species="A" 


weight^' 


1" pattern="xl "> 


3 


<smtb:carbon po s i t i on =" 1 " /> 






4 


</ smtb:cumomer> 






5 


<smtb:cumomer id="A_2" species="A" 


weight=' 


1" pattern=" lx"> 


6 


<smtb:carbon po s i t i on ="2" /> 






7 


</ smtb:cumomer> 






8 


<smtb:cumomer id="A_3" species="A" 


weight=' 


2" pattern="ll"> 


9 


<smtb:carbon po s i t i on =" 1 " /> 






10 


<smtb:carbon po s i t i on ="2" /> 






11 


</ smtb:cumomer> 






12 


</ species> 







Figure A.4. Fragment of the XML file transformed after the first set of XSL transformation: the cumomers of a given intermediate species are 
named and their labeling is described using jsmtbxarbon^ elements, in the namespace smtb 
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<smtb:listOfIntermediateCumomers xmlns 
<smtb:listOfCumomers weight="l"> 
<smtb:cumomer id = "A_l" species="A" 

<smtb:carbon po s i t i on =" 1 " /> 
</ smtb:cumomer> 

<smtb:cumomer id="A_2" species="A" 

<smtb:carbon po s i t i on ="2" /> 
</ smtb:cumomer> 

<smtb:cumomer id="D_l" species="D" 

<smtb:carbon po s i t i on =" 1 " /> 
</ smtb:cumomer> 

<smtb:cumomer id="F_l" species = 

<smtb:carbon po si ti on =" 1 " /> 
</ smtb:cumomer> 

<smtb:cumomer id="F_2" species = 

<smtb:carbon po s i t i on ="2" /> 
</ smtb:cumomer> 
</smtb:listOfCumomers> 
<smtb:listOfCumomers weight="2"> 
<smtb:cumomer id="A_3" species="A 
<smtb:carbon po sition=" 1 " /> 
<smtb:carbon po s i t i on ="2" /> 
</ smtb:cumomer> 

<smtb:cumomer id="F_3" species="F 
<smtb:carbon po s i t i on =" 1 " /> 
<smtb:carbon po s i t i on ="2" /> 
</ smtb:cumomer> 
</smtb:listOfCumomers> 
</smtb:listOfIntermediateCumomers> 
<smtb:listOt'InputCumomers xmlns :smtb = 
<smtb:listOfCumomers weight="l"> 

<smtb:cumomer id="A_out_l" species = 

<smtb:carbon po s i t i on =" 1 " /> 
</ smtb:cumomer> 

<smtb:cumomer id="A_out_2" species = 

<smtb:carbon po s i t i on ="2" /> 
</ smtb:cumomer> 
</smtb:listOfCumomers> 
<smtb:listOt'Cumomers weight="2"> 

<smtb:cumomer id="A_out_3" species: 
<smtb:carbon po s i t i on =" 1 " /> 
<smtb:carbon po s i ti on ="2" /> 
</ smtb:cumomer> 
</smtb:listOfCumomers> 
</smtb:listOf!nputCumomers> 



smtb="http: //www. utc . fr/sysmetab"> 

weight="l" pattern="xl" p o s i t io n =" 1 "> 

weight="l" pattern="lx" position="2"> 

weight="l" pattern="l" position="3"> 

F" weight="l" pattern="xl" position="4"> 

F" weight="l" pattern="lx" position="5"> 

weight="2" pattern="ll" p o s i t io n =" 1 "> 

weight="2" pattern="ll" position="2"> 



h t tp : //www. utc . fr/sysmetab"> 

="A_out" weight="l" pattern="xl" p o s i t i on =" 1 "> 

= "A_out" weight="l" pattern = "lx" p o s i t i on ="2"> 

'A_out" weight="2" pattern = "ll" p o s i t i on =" 1 "> 



Figure A. 5. Fragment of the XML file transformed after the first set of XSL transformation: enumeration of all cumomers sorted by weight and 
type (intermediate or input species). 
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1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 
44 
45 
46 
47 
48 
49 
50 
51 
52 
53 
54 
55 
56 
57 
58 
59 
60 



function [xl,x2, dxl _dv , dx2_dv ] = solveCumomers ( v , x 1 _input , x2_input ) 
nl=5; // Weight 1 cumomers 
Ml_ijv = [l,l,-(v(l) + v(2) + v(3)) 
2,2,-(v(l) + v(2) + v(3)) 

3 .2 , v(2) 
3,1 ,v(2) 

3.3, -(v(4) + v(4)) 
4,1 ,v(l) 
4,2,v(3) 

4,3 ,v(4) 

4.4, -v(5) 

5.2, v(l) 
5 ,1 ,v(3) 

5 .3 , v(4) 

5.5, -v(5)]; 

Ml= sparse (Ml_ijv( : , 1 :2 ) , Ml.ijv ( : , 3 ) , [ nl , nl ] ) ; 
b 1 _i j v = [1 ,1 , v(6). * x 1 .input ( 1 , : ) 
2,1 ,v(6).* xl .input (2 , : )] ; 

bl_l = s_full(bl_ijv(: ,1 :2),bl_ijv(: ,3) ,[nl ,1]); 

[ M 1 .handle , M 1 .rank ] = 1 u f a c t (Ml ) ; 
xl = lusolve (Ml.handle , — [b 1 _ 1 ]); 

dfl_dv_ijv=[l,6,xl_input(l,:) 

1.1, -xl (1 , : ) 

1.2, -xl (1 , : ) 

1.3, -xl (1 , : ) 

2.6, xl_input(2, : ) 

2.1, -xl (2 , : ) 

2.2, -xl (2 , : ) 

2.3, -xl (2 , : ) 
3 ,2 ,xl (2 , : ) 

3 ,2 ,xl (1 , : ) 

3.4, -xl (3 , : ) 

3.4, -xl (3 , : ) 
4,1 ,xl (1 , : ) 

4.3 ,xl (2 , : ) 

4.4 ,xl (3 , : ) 

4.5, -xl (4 , : ) 

5 .1 , xl (2 , : ) 
5 ,3 ,xl (1 , : ) 

5 .4 , xl (3 , : ) 

5.5, -xl (5 , : )] ; 

dfl_dv_l = s_full ( df l_dv_ijv(: ,1 :2) , df l_dv_ijv (: ,3) ,[nl ,6]); 
dxl_dv(: , : , 1 )= lusolve ( Ml .handle ,- dfl _dv_ 1 ); 

ludel(Ml_handle); 

n2 = 2; // Weight 2 cumomers 
M2_ijv = [l,l,-(v(l) + v(2) + v(3)) 
2,1 ,v(l) 
2,1 ,v(3) 

2.2, -v(5)]; 

M2=sparse(M2_ijv(: , 1 :2 ) , M2_ijv ( : , 3) , [ n2 , n2 ] ) ; 
b2_ijv = [1 ,1 ,v(6).* x2 .input (1 , : ) 
2,1 ,v(4).*xl (3 , : ).*xl(3 , : )] ; 

b2_l = s_full (b2_ijv(: ,1 :2),b2_ijv(: ,3) , [ n2 , 1 ] ) ; 

[M2_handle , M2_rank]= luf act (M2); 
x2=lusolve (M2_handle ,-[b2_l ]); 



Figure A. 6. The generated Scilab code for the Branching example (continued on Figure A. 7) 

16 



df2_dv_ijv =[1 ,6 , x2 .input (1 , : ) 




l,l,-x2(l , : ) 




l,2,-x2(l , : ) 




l,3,-x2(l , : ) 




2,1 ,x2(l , : ) 




2,3 ,x2(l , : ) 




2,4,xl(3,:).*xl(3,:) 




2,5,-x2(2 , : )] ; 




df2_dv_l = s_full(df2_dv_ijv(: , 1 :2 ) , df 2 _d v _ij v ( : , 3 ) , [ n2 , 6 ] ) ; 




db2_dxl_ijv=[2 ,3 ,xl (3 , : ).* v(4) 




2 ,3 ,xl (3 , : ).* v(4)] ; 




db2_dxl_l = sparse (db2_dxl _ijv ( : ,1 :2 ) , db2_dxl_ijv (: ,3) , [ n2 ,nl ]); 




dx2_dv = zeros(n2,6 ,1); 




dx2_dv(: , : , 1 ) = 1 u s ol v e ( M2_handle , - ( df 2 _dv _ 1 +db2_dx 1 _ 1 * dx 1 _dv ( : , : , 1 ))) ; 




ludel(M2_handle); 




endfunction 




function [cost , grad ] = cost AndGrad ( v ) 




[xl , x2 , dxl _dv , dx2_dv] = solveCumomers (v,xl_input ,x2_input); 




e _1 ab e 1 =(Cl*xl+C2*x2)— yobs ; 




e _f lu x =E* v— vobs ; 




c o s t = . 5 * ( sum ( d e 1 1 a . * e _ f 1 u x . ~ 2 ) + sum ( a 1 p h a ( : ,l).*e_label(: , 1 ) . " 2 ) ) ; 




grad=( delta .* e.flux ) '*E+( alpha (:, 1 ).* e .label (:, 1 ))' *(Cl*dxl _dv (:, : , 1 ) + C2* dx2_dv ( : 


: ,1)); 


endfunction 





Figure A.7. The generated Scilab code for the Branching example (begining of code is on Figure A.6) 
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Appendix B. Computation of the gradient in the non stationnary case 



The discrete state equation in its cascade form is easily obtained from (20) and the definition of f as 

(x fe - ^M fe ) x'^ 1 = (x k + x* fe + ^ (b,(x 4 ) + b fc (x 4+1 )) , 1 < i < N, (B.l) 

for A; = 1 . . . n. We recall that bfc(x) only depends on x; for I < k, so that the right-hand side of (B.l) is already 
known at stage k. To obtain x' k +1 at each time step i we just have have to solve a sparse linear system with a matrix 
whose LU factors need to be determined only once before the iterations. The cascade structure of discretized state 
and adjoint state equations is easily recovered. The discretized state equations (20) for weights k = 1 ... n, by 

(X„ - ^Mj) pr 1 = (x* + jMj) pi + \ •£ (g( x <)) T (p! + P r'), Ki<N. (B.3) 

As in the continous case, the adjoint states p k are obtained in decreasing weight order. The two components of the 
gradient are finaly obtained by 

' i=l k=l x ' 

and 

' j ii \ \ T N ~ 1 n / a \ T 

aJ(v, m) \ Y^ Y^ / " 



dm 



) = ££(^( D ^ m )K +1 -4))) 

i=l fc=l ^ 



where for a given vector z e M" fc the matrix ^(Xfe(m)z) is defined by 

/ d \ I Zi, if the cumomer fraction Zi belongs to metabolite j, 



\ 9m 



(X fc (m)z)) = < (B.4) 
0, otherwise. 
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